CLAIMS



1. A method for clustering queries, the method comprising:

identifying a same document and/or a plurality of similar documents
selected by a user in response to a plurality of queries; and

responsive to identifying the same document and/or the similar documents,
generating a query cluster to indicate that the queries are similar independent of
whether individual ones of the queries comprise similar composition with respect
to other ones of the queries.
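
By way of illustration only, the grouping recited in claim 1 might be sketched as follows. The claim does not prescribe a clustering algorithm; the union-find grouping, the `(query, doc_id)` log format, and the name `cluster_queries` are assumptions made for this sketch.

```python
from collections import defaultdict


def cluster_queries(click_log):
    """Group queries that share at least one selected document.

    click_log: iterable of (query, doc_id) pairs (hypothetical log format).
    Returns a list of sets of queries; query wording is never compared.
    """
    parent = {}

    def find(q):
        # Register unseen queries as their own cluster representative.
        parent.setdefault(q, q)
        while parent[q] != q:
            parent[q] = parent[parent[q]]  # path halving
            q = parent[q]
        return q

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    first_query_for_doc = {}
    for query, doc_id in click_log:
        find(query)  # keep queries that share nothing as singleton clusters
        if doc_id in first_query_for_doc:
            union(first_query_for_doc[doc_id], query)
        else:
            first_query_for_doc[doc_id] = query

    clusters = defaultdict(set)
    for q in parent:
        clusters[find(q)].add(q)
    return list(clusters.values())
```

In this sketch, two queries with no words in common fall into one cluster as soon as the log shows the same document selected for both.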



2. A method as recited in claim 1, wherein the queries comprise a well
formed natural language question, a keyword, or a phrase.

3. A method as recited in claim 1, wherein the query cluster is used to
disambiguate a word or phrase in a query of the queries.



4. A method as recited in claim 1, further comprising determining that
the queries are similar based on similar keyword or phrase composition.
5. A method as recited in claim 1, wherein identifying the same
document and/or the similar documents further comprises:

determining the similar documents by evaluating a set of selected similar
documents chosen responsive to queries p and q of the queries, wherein
documents $D_C(\cdot)$ is a subset of a result list $D(\cdot)$ according to the following:

$$D_C(p) = \{d_{p1}, d_{p2}, \ldots, d_{pi}\} \subseteq D(p)$$

$$D_C(q) = \{d_{q1}, d_{q2}, \ldots, d_{qj}\} \subseteq D(q);$$

wherein similarity based on selection of documents is based on:

If $D_C(p) \cap D_C(q) = \{d_{pq1}, d_{pq2}, \ldots, d_{pqk}\} \neq \emptyset$, then documents
$d_{pq1}, d_{pq2}, \ldots, d_{pqk}$ represent a set of common topics of queries p and q, and,

whereby the similar documents between queries p and q are determined by
$D_C(p) \cap D_C(q)$.
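
As a non-limiting sketch, the intersection test of claim 5 can be expressed with ordinary set operations. The function names and the use of document identifiers as set members are assumptions for illustration.

```python
def common_selected_documents(clicked_p, clicked_q):
    """Return D_C(p) ∩ D_C(q): documents selected for both queries."""
    return set(clicked_p) & set(clicked_q)


def share_topic(clicked_p, clicked_q):
    """True when the intersection is non-empty, i.e. the two queries
    have at least one selected document (common topic) between them."""
    return bool(common_selected_documents(clicked_p, clicked_q))
```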



6. A method as recited in claim 1, further comprising constructing a
thesaurus comprising a plurality of synsets, wherein each synset comprises one or
more query clusters.



7. A method as recited in claim 1, wherein identifying the same
document and/or the similar documents further comprises determining the similar
documents based on a proportionality of commonly selected individual
documents.
8. A method as recited in claim 7, wherein identifying the same
document and/or the similar documents further comprises:

determining the similar documents based on a proportionality of commonly
selected individual documents, such that:

$$\mathrm{similarity}(p, q) = \frac{RD(p, q)}{\mathrm{Max}(rd(p),\, rd(q))}$$

wherein $rd(\cdot)$ is the number of clicked documents for a query of the queries,
and wherein $RD(p, q)$ is the number of document selections in common.
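
For illustration only, the ratio recited in claim 8 might be computed as below. Counting distinct documents for rd(.) and returning 0.0 when neither query has any clicked document are assumptions of this sketch; the claim states neither.

```python
def click_similarity(clicked_p, clicked_q):
    """RD(p, q) / Max(rd(p), rd(q)) over the clicked-document sets."""
    docs_p, docs_q = set(clicked_p), set(clicked_q)
    largest = max(len(docs_p), len(docs_q))  # Max(rd(p), rd(q))
    if largest == 0:
        return 0.0  # assumption: no clicks means no evidence of similarity
    return len(docs_p & docs_q) / largest    # RD(p, q) in the numerator
```

Dividing by the larger click count keeps the value between 0 and 1 and reaches 1 only when both queries led to exactly the same selected documents.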



9. A method as recited in claim 1, wherein identifying the same
document and/or the similar documents further comprises:

determining the similar documents based on a hierarchical positioning
between individual ones of a plurality of documents commonly selected across the
queries.
10. A method as recited in claim 9:

wherein $F(d_i, d_j)$ is a lowest common parent node for documents $d_i$ and $d_j$;

wherein $L(x)$ is a level of a node x;

wherein $L_{Total}$ identifies a total number of levels in a hierarchy; and

wherein a similarity between two documents is defined as follows:

$$s(d_i, d_j) = \frac{L(F(d_i, d_j)) - 1}{L_{Total} - 1}$$

such that $s(d_i, d_i) = 1$; and $s(d_i, d_j) = 0$ if $F(d_i, d_j)$ = root; and

the method further comprises:

incorporating $s(d_i, d_j)$ into a calculation of query similarity, wherein
$d_i$ ($1 \le i \le m$) and $d_j$ ($1 \le j \le n$) are a set of selected documents for queries p and q
respectively such that:

$$\mathrm{similarity}(p, q) = \frac{1}{2} \times \left( \frac{\sum_{i=1}^{m} \max_{j=1}^{n} s(d_i, d_j)}{rd(p)} + \frac{\sum_{j=1}^{n} \max_{i=1}^{m} s(d_i, d_j)}{rd(q)} \right)$$
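
A minimal sketch of the hierarchy-based calculation of claim 10 follows, under stated assumptions: the hierarchy is given as a child-to-parent mapping, the root sits at level 1, the total number of levels is greater than 1, rd(p) and rd(q) equal the numbers of selected documents m and n, and the two normalized sums are averaged. All function and parameter names are hypothetical.

```python
def ancestors(node, parent):
    """Proper ancestors of node, nearest first; parent maps child -> parent."""
    chain = []
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def level(node, parent):
    """L(x): level of a node, with the root at level 1 (assumption)."""
    return len(ancestors(node, parent)) + 1


def doc_similarity(di, dj, parent, total_levels):
    """s(di, dj) = (L(F(di, dj)) - 1) / (L_Total - 1); requires total_levels > 1."""
    if di == dj:
        return 1.0  # s(di, di) = 1 as recited
    anc_j = set(ancestors(dj, parent))
    for node in ancestors(di, parent):  # nearest first: first hit is F(di, dj)
        if node in anc_j:
            return (level(node, parent) - 1) / (total_levels - 1)
    return 0.0  # no common parent at all


def hierarchy_similarity(docs_p, docs_q, parent, total_levels):
    """Average of the two normalized best-match sums over selected documents."""
    if not docs_p or not docs_q:
        return 0.0  # assumption: undefined case treated as no similarity
    term_p = sum(
        max(doc_similarity(di, dj, parent, total_levels) for dj in docs_q)
        for di in docs_p
    ) / len(docs_p)
    term_q = sum(
        max(doc_similarity(di, dj, parent, total_levels) for di in docs_p)
        for dj in docs_q
    ) / len(docs_q)
    return 0.5 * (term_p + term_q)
```

When the lowest common parent is the root, `level` returns 1 and the document similarity evaluates to 0, matching the recited boundary condition without a separate branch.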



11. Computer-readable media comprising computer-executable
instructions for identifying similar queries, the computer-executable instructions
comprising instructions for:

identifying a same document and/or a plurality of similar documents
selected by a user in response to a plurality of queries; and

responsive to identifying the same document and/or the similar documents,
generating a query cluster to indicate that the queries are similar independent of
whether individual ones of the queries comprise similar composition with respect
to other ones of the queries.



lee@hayes pllc 509-324-9256    MS1-936US.PAT.APP



12. Computer-readable media as recited in claim 11, wherein the queries
comprise a well formed natural language question, a keyword, or a phrase.

13. Computer-readable media as recited in claim 11, wherein the query
cluster is used to disambiguate a word or phrase in a query of the queries.

14. Computer-readable media as recited in claim 11, wherein the
computer-executable instructions further comprise instructions for determining
that the queries are similar based on similar keyword or phrase composition.



15. Computer-readable media as recited in claim 11, wherein the
instructions for identifying the same document and/or the similar documents
further comprise instructions for:

determining the similar documents by evaluating a set of selected similar
documents chosen responsive to queries p and q of the queries, wherein
documents $D_C(\cdot)$ is a subset of a result list $D(\cdot)$ according to the following:

$$D_C(p) = \{d_{p1}, d_{p2}, \ldots, d_{pi}\} \subseteq D(p)$$

$$D_C(q) = \{d_{q1}, d_{q2}, \ldots, d_{qj}\} \subseteq D(q);$$

wherein similarity based on selection of documents is based on:

If $D_C(p) \cap D_C(q) = \{d_{pq1}, d_{pq2}, \ldots, d_{pqk}\} \neq \emptyset$, then documents
$d_{pq1}, d_{pq2}, \ldots, d_{pqk}$ represent a set of common topics of queries p and q, and,

whereby the similar documents between queries p and q are determined by
$D_C(p) \cap D_C(q)$.
16. Computer-readable media as recited in claim 11, wherein the
computer-executable instructions further comprise instructions for constructing a
thesaurus comprising a plurality of synsets, wherein each synset comprises one or
more query clusters.



17. Computer-readable media as recited in claim 11, wherein the
instructions for identifying the same document and/or the similar documents
further comprise instructions for determining the similar documents based on a
proportionality of commonly selected individual documents.



18. Computer-readable media as recited in claim 17, wherein the
instructions for identifying the same document and/or the similar documents
further comprise instructions for:

determining the similar documents based on a proportionality of commonly
selected individual documents, such that:

$$\mathrm{similarity}(p, q) = \frac{RD(p, q)}{\mathrm{Max}(rd(p),\, rd(q))}$$

wherein $rd(\cdot)$ is the number of clicked documents for a query of the queries,
and wherein $RD(p, q)$ is the number of document selections in common.
19. Computer-readable media as recited in claim 11, wherein the
instructions for identifying the same document and/or the similar documents
further comprise instructions for:

determining the similar documents based on a hierarchical positioning
between individual ones of a plurality of documents commonly selected across the
queries.

20. Computer-readable media as recited in claim 19:

wherein $F(d_i, d_j)$ is a lowest common parent node for documents $d_i$ and $d_j$;

wherein $L(x)$ is a level of a node x;

wherein $L_{Total}$ identifies a total number of levels in a hierarchy; and

wherein a similarity between two documents is defined as follows:

$$s(d_i, d_j) = \frac{L(F(d_i, d_j)) - 1}{L_{Total} - 1}$$

such that $s(d_i, d_i) = 1$; and $s(d_i, d_j) = 0$ if $F(d_i, d_j)$ = root; and

wherein the computer-executable instructions further comprise instructions
for:

incorporating $s(d_i, d_j)$ into a calculation of query similarity, wherein
$d_i$ ($1 \le i \le m$) and $d_j$ ($1 \le j \le n$) are a set of selected documents for queries p and q
respectively such that:

$$\mathrm{similarity}(p, q) = \frac{1}{2} \times \left( \frac{\sum_{i=1}^{m} \max_{j=1}^{n} s(d_i, d_j)}{rd(p)} + \frac{\sum_{j=1}^{n} \max_{i=1}^{m} s(d_i, d_j)}{rd(q)} \right)$$







21. A computing device comprising:

a processor coupled to a memory, the memory comprising
computer-executable instructions, the processor being configured to fetch and
execute the computer-executable instructions for:

identifying a same document and/or a plurality of similar documents
selected by a user in response to a plurality of queries; and

responsive to identifying the same document and/or the similar
documents, generating a query cluster to indicate that the queries are similar
independent of whether individual ones of the queries comprise similar
composition with respect to other ones of the queries.

22. A computing device as recited in claim 21, wherein the queries
comprise a well formed natural language question, a keyword, or a phrase.

23. A computing device as recited in claim 21, wherein the query cluster
is used to disambiguate a word or phrase in a query of the queries.



24. A computing device as recited in claim 21, wherein the
computer-executable instructions further comprise instructions for determining
that the queries are similar based on similar keyword or phrase composition.
25. A computing device as recited in claim 21, wherein the instructions
for identifying the same document and/or the similar documents further comprise
instructions for:

determining the similar documents by evaluating a set of selected similar
documents chosen responsive to queries p and q of the queries, wherein
documents $D_C(\cdot)$ is a subset of a result list $D(\cdot)$ according to the following:

$$D_C(p) = \{d_{p1}, d_{p2}, \ldots, d_{pi}\} \subseteq D(p)$$

$$D_C(q) = \{d_{q1}, d_{q2}, \ldots, d_{qj}\} \subseteq D(q);$$

wherein similarity based on selection of documents is based on:

If $D_C(p) \cap D_C(q) = \{d_{pq1}, d_{pq2}, \ldots, d_{pqk}\} \neq \emptyset$, then documents
$d_{pq1}, d_{pq2}, \ldots, d_{pqk}$ represent a set of common topics of queries p and q, and,

whereby the similar documents between queries p and q are determined by
$D_C(p) \cap D_C(q)$.

26. A computing device as recited in claim 21, wherein the
computer-executable instructions further comprise instructions for constructing a
thesaurus comprising a plurality of synsets, wherein each synset comprises one or
more query clusters.

27. A computing device as recited in claim 21, wherein the instructions
for identifying the same document and/or the similar documents further comprise
instructions for determining the similar documents based on a proportionality of
commonly selected individual documents.
28. A computing device as recited in claim 21, wherein the instructions
for identifying the same document and/or the similar documents further comprise
instructions for:

determining the similar documents based on a proportionality of commonly
selected individual documents, such that:

$$\mathrm{similarity}(p, q) = \frac{RD(p, q)}{\mathrm{Max}(rd(p),\, rd(q))}$$

wherein $rd(\cdot)$ is the number of clicked documents for a query of the queries,
and wherein $RD(p, q)$ is the number of document selections in common.



29. A computing device as recited in claim 21, wherein the instructions
for identifying the same document and/or the similar documents further comprise
instructions for:

determining the similar documents based on a hierarchical positioning
between individual ones of a plurality of documents commonly selected across the
queries.
30. A computing device as recited in claim 29:

wherein $F(d_i, d_j)$ is a lowest common parent node for documents $d_i$ and $d_j$;

wherein $L(x)$ is a level of a node x;

wherein $L_{Total}$ identifies a total number of levels in a hierarchy; and

wherein a similarity between two documents is defined as follows:

$$s(d_i, d_j) = \frac{L(F(d_i, d_j)) - 1}{L_{Total} - 1}$$

such that $s(d_i, d_i) = 1$; and $s(d_i, d_j) = 0$ if $F(d_i, d_j)$ = root; and

wherein the computer-executable instructions further comprise instructions
for:

incorporating $s(d_i, d_j)$ into a calculation of query similarity, wherein
$d_i$ ($1 \le i \le m$) and $d_j$ ($1 \le j \le n$) are a set of selected documents for queries p and q
respectively such that:

$$\mathrm{similarity}(p, q) = \frac{1}{2} \times \left( \frac{\sum_{i=1}^{m} \max_{j=1}^{n} s(d_i, d_j)}{rd(p)} + \frac{\sum_{j=1}^{n} \max_{i=1}^{m} s(d_i, d_j)}{rd(q)} \right)$$